Abstract

We study Mertens' own proof (1874) of his theorem on the sum of the reciprocals of the primes and compare it with the modern treatments.



Contents

1 Historical Introduction
  1.1 Euler
  1.2 Legendre and Chebyshev
  1.3 Mertens

2 The Modern Proof
  2.1 Partial Summation
  2.2 The Relation with $\pi(x)$
  2.3 The First Grossehilfsatz

3 Mertens' Proof
  3.1 A Sketch of the Proof
  3.2 Euler-Maclaurin and Stirling
  3.3 The First Step of Mertens' Proof
  3.4 Mertens' Use of Partial Summation
  3.5 Proof of the Grossehilfsatz 1
  3.6 The Grossehilfsatz 2
    3.6.1 Mertens' proof
    3.6.2 Modern Proof
  3.7 The Formula for the Constant H
  3.8 Completion of the Proof

4 Retrospect and Prospect
  4.1 Retrospect
  4.2 Prospect



1 Historical Introduction 



1.1 Euler 

In 1737, Leonhard Euler created analytic (prime) number theory with the publication of his memoir "Variae observationes circa series infinitas" in Commentarii academiae scientiarum Petropolitanae 9 (1737), 160-188; Opera omnia (1) XIV, 216-244. Theorema 7 states:

"If we take to infinity the continuation of these fractions

\[ \frac{2\cdot 3\cdot 5\cdot 7\cdot 11\cdot 13\cdot 17\cdot 19\cdots}{1\cdot 2\cdot 4\cdot 6\cdot 10\cdot 12\cdot 16\cdot 18\cdots} \]

where the numerators are all the prime numbers and the denominators are the numerators less one unit, the result is the same as the sum of the series

\[ 1+\frac{1}{2}+\frac{1}{3}+\frac{1}{4}+\frac{1}{5}+\frac{1}{6}+\cdots." \]

This is the wonderful identity which, today, we write 0, [H], 0:

\[ \prod_{p}\frac{1}{1-\dfrac{1}{p^{1+\rho}}}=\sum_{n=1}^{\infty}\frac{1}{n^{1+\rho}}. \tag{1.1.1} \]

Here $\rho > 0$ and the product on the left is taken over all primes $p \ge 2$, while the right hand side is the famous Riemann zeta function, $\zeta(1+\rho)$. The modern statement is nice, but does not have the sense of wonder that Euler's statement carries. Yes, it is not rigorous, but it is beautiful.

Euler's memoir is replete with extraordinary identities relating infinite products and series of primes, but our interest is in his Theorema 19:

"Summa seriei reciprocae numerorum primorum

\[ \frac{1}{2}+\frac{1}{3}+\frac{1}{5}+\frac{1}{7}+\frac{1}{11}+\frac{1}{13}+\text{etc.} \]

est infinite magna, infinities tamen minor quam summa seriei harmonicae

\[ 1+\frac{1}{2}+\frac{1}{3}+\frac{1}{4}+\frac{1}{5}+\text{etc.} \]

Atque illius summa est huius summae quasi logarithmus."

We translate this as our first formal theorem.

Theorem 1. The sum of the reciprocals of the prime numbers

\[ \frac{1}{2}+\frac{1}{3}+\frac{1}{5}+\frac{1}{7}+\frac{1}{11}+\frac{1}{13}+\text{etc.} \]

is infinitely great but is infinitely many times less than the sum of the harmonic series

\[ 1+\frac{1}{2}+\frac{1}{3}+\frac{1}{4}+\frac{1}{5}+\text{etc.} \]

And the sum of the former is as the logarithm of the sum of the latter.
□

The last line of Euler's attempted proof is:
". . . and finally,

\[ \frac{1}{2}+\frac{1}{3}+\frac{1}{5}+\frac{1}{7}+\frac{1}{11}+\cdots=\ln\ln\infty." \]

(We have written "$\ln\ln\infty$" instead of Euler's "$l\,l\,\infty$.")

It is evident that Euler says that the series of prime reciprocals diverges and that the partial sums grow like the logarithm of the partial sums of the harmonic series, that is, $\sum_{p\le x}\frac{1}{p}$ grows like $\ln\ln x$. Of course, this implies (trivially) that there are infinitely many primes, since the series of reciprocal primes must necessarily have infinitely many summands. Moreover, it even indicates the velocity of divergence and therefore the density of the primes, a totally new idea.

This was the first application of analysis (limits and infinite series) to prove a theorem in number theory, the first new proof of the infinity of primes in two thousand years (!), and it opened an entirely new branch of mathematics, analytic number theory, which is a rich and fecund branch of modern mathematics.



1.2 Legendre and Chebyshev 

The first quantitative statement of Euler's theorem on the sum of the reciprocal primes appeared in Legendre's Theorie des nombres (troisieme edition, quatrieme partie, VIII, (1808)), namely:

\[ \sum_{p\le G}\frac{1}{p}=\ln(\ln G-0.08366)+C \]

where G is a given real number and C is an unknown numerical constant. Legendre gave no hint of a proof nor of the origin of the mysterious constant "0.08366."

In 1852, no less a mathematician than the great Russian analyst Chebyshev [I] attempted a proof of Legendre's theorem, but failed. The problem of finding such a proof became celebrated, and the stage was set for its solution.



1.3 Mertens 

In 1874 (see [Hj) the brilliant young Polish-Austrian mathematician¹ Franciszek MERTENS published a proof of his now famous theorem on the sum of the prime reciprocals:

Theorem 2. (Mertens (1874)) Let $x \ge 1$ be any real number. Then

\[ \sum_{p\le x}\frac{1}{p}=\ln\ln[x]+\gamma+\sum_{m=2}^{\infty}\mu(m)\frac{\ln\{\zeta(m)\}}{m}+\delta \tag{1.3.1} \]

where $\gamma$ is Euler's constant, $\mu(m)$ is the Möbius function, $\zeta(m)$ is the Riemann zeta function, and

\[ |\delta|<\frac{4}{\ln([x]+1)}+\frac{2}{[x]\ln[x]}. \tag{1.3.2} \]

□

¹ He was a professor of mathematics for over 20 years (1865-1884) at the Jagiellonian University in Cracow. At that time, Poland was partitioned among Prussia, Russia and Austria, and Cracow was in the Austrian zone; there was no independent Polish state then. Mertens' wife was Polish and he spoke Polish as well as German. Then he went to Graz to become rector of the polytechnic there. ^B]



(We write $[x] :=$ the greatest integer in $x$.) We have slightly altered his notation. Today we write the statement of Mertens' theorem in the form [Oj, |H]:

Theorem 3.

\[ B:=\lim_{x\to\infty}\left(\sum_{p\le x}\frac{1}{p}-\ln\ln x\right) \]

is a well-defined constant. □

An alternative, more precise statement of the modern theorem is:

Theorem 4.

\[ \sum_{p\le x}\frac{1}{p}=\ln\ln x+B+O\left(\frac{1}{\ln x}\right). \]

□



The modern presentations of MERTENS' theorem, [S],[B], [Oj jl 1 j include:

1. no discussion of an explicit numerical error estimate (such as Mertens' $\delta$).

2. no computation of B, in particular, no proof of the wonderful formula:

\[ B=\gamma+\sum_{n=2}^{\infty}\mu(n)\frac{\ln\{\zeta(n)\}}{n}. \tag{1.3.3} \]

Mertens used this formula to compute the value:

\[ B\approx 0.2614972128. \]

3. no hint of how Mertens, himself, proved his explicit theorem.

In this paper we will present a self-contained, motivated exposition of Mertens' original proof and compare its strategy, tactics, and details with the modern approach. Mertens' proof is brilliant, insightful, and instructive. It deserves to be better known and our paper attempts to achieve this.²

² Mertens' paper also contains a proof of his (almost) equally famous product-theorem:

\[ \prod_{p\le G}\frac{1}{1-\dfrac{1}{p}}=e^{\gamma+\delta'}\ln G \]

where $|\delta'|<\frac{4}{\ln(G+1)}+\frac{2}{G\ln G}+\frac{1}{2G}$. There is nothing new in his treatment that does not appear in the theorem we are dealing with, so we do not discuss it here.
2 The Modern Proof



2.1 Partial Summation

Modern prime number theory, indeed number theory in general, has developed a systematic approach to the computation of finite sums of number theoretic functions by use of what is called "Abel summation," or "partial summation." We follow [§].

Theorem 5. (Abel Summation) Let $y < x$, and let $f$ be a function (with real or complex values) having a continuous derivative on $[y,x]$. Then

\[ \sum_{y<r\le x}a(r)f(r)=A(x)f(x)-A(y)f(y)-\int_{y}^{x}A(t)f'(t)\,dt \tag{2.1.1} \]

where the numbers $a(r)$ are given for the integers $r$, and where

\[ A(x):=\sum_{r\le x}a(r). \tag{2.1.2} \]

□

We will apply this technique to the sum

\[ \sum_{p\le x}\frac{1}{p}. \]

2.2 The Relation with $\pi(x)$

Example 1. Take $y := 2$ in the Partial Summation formula, and take

\[ a(r):=\begin{cases}1 & \text{if } r=p\\ 0 & \text{if } r\neq p\end{cases} \]

that is, $a(r)$ is the characteristic function of the prime numbers $p$. Moreover, take:

\[ f(r):=\frac{1}{r}. \]

Then we conclude that

\[ A(x)=\sum_{p\le x}1 \]

is equal to the number of prime numbers $p \le x$, i.e., the prime counting function $\pi(x)$. Therefore, the formula for Abel summation gives us:

\[ \sum_{p\le x}\frac{1}{p}=\frac{\pi(x)}{x}+\int_{2}^{x}\frac{\pi(t)}{t^{2}}\,dt, \tag{2.2.1} \]

a very pretty equation relating our sum to the famous function $\pi(x)$. Unfortunately, in order to apply it we must know upper and lower bounds for $\pi(x)$, and the study of such bounds is the subject of the Prime Number Theorem, something much deeper than our topic.
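
The identity (2.2.1) can be checked exactly for small integer $x$, since $\pi(t)$ is a step function and the integral reduces to a finite sum. The following is only an illustrative sketch: the helper names `primes_up_to`, `abel_rhs` and `prime_reciprocal_sum` are our own and do not come from any source cited here.

```python
from fractions import Fraction


def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes p <= n."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]


def abel_rhs(x):
    """Right-hand side of (2.2.1): pi(x)/x + integral_2^x pi(t)/t^2 dt, for integer x >= 2."""
    ps = primes_up_to(x)
    total = Fraction(len(ps), x)  # boundary term pi(x)/x
    knots = ps + [x]
    # pi(t) equals k on [p_k, p_{k+1}), so that piece integrates to k*(1/a - 1/b)
    for k in range(1, len(ps) + 1):
        a, b = knots[k - 1], knots[k]
        total += k * (Fraction(1, a) - Fraction(1, b))
    return total


def prime_reciprocal_sum(x):
    """Left-hand side of (2.2.1): sum of 1/p over primes p <= x."""
    return sum(Fraction(1, p) for p in primes_up_to(x))


if __name__ == "__main__":
    for x in (2, 10, 97, 100):
        print(x, abel_rhs(x) == prime_reciprocal_sum(x))
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than within a floating-point tolerance.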
2.3 The First Grossehilfsatz

Example 2. Following Mertens (in a slightly different context: see 3.4) we again take $y := 2$, but this time we take

\[ a(r):=\begin{cases}\dfrac{\ln p}{p} & \text{if } r=p\\[2mm] 0 & \text{if } r\neq p\end{cases} \]

and

\[ f(r):=\frac{1}{\ln r}. \]

Then

\[ A(x)=\sum_{p\le x}\frac{\ln p}{p}. \tag{2.3.1} \]

Therefore, the formula for Abel summation gives us:

\[ \sum_{p\le x}\frac{1}{p}=\frac{A(x)}{\ln x}+\int_{2}^{x}\frac{A(t)}{t(\ln t)^{2}}\,dt, \tag{2.3.2} \]

a nice formula, but with $A(x)$ the slightly more exotic function given in (2.3.1). In his paper, Mertens proves two "Grossehilfsätze" (in Landau's marvelous German phraseology: the English "fundamental lemmas" does not carry the same force.) The first one deals with our $A(x)$.

Grossehilfsatz 1.

\[ \sum_{p\le x}\frac{\ln p}{p}=\ln x+R(x),\quad\text{where } |R(x)|<2. \tag{2.3.3} \]

□

The interest in this is the explicit numerical error estimate, $< 2$, which, as we will see, is quite good.

We will give Mertens' nice proof of this result later on (see 3.5), but for now we assume it to be true.
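
As a numerical sanity check of the bound $|R(x)| < 2$ in (2.3.3), one can scan all real $x$ in a finite range. The sketch below is an illustration under our own naming (`primes_up_to`, `max_abs_remainder`); it uses the fact that $A(x)$ is constant between consecutive integers, so $|R|$ on $[n, n+1)$ is extremal at the ends of that interval. A finite scan is of course no substitute for the proof.

```python
import math


def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes p <= n."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]


def max_abs_remainder(limit):
    """Largest |R(x)| for 2 <= x < limit + 1, with R(x) = sum_{p<=x} ln(p)/p - ln(x)."""
    prime_set = set(primes_up_to(limit))
    a = 0.0      # running value of A(n) = sum_{p<=n} ln(p)/p
    worst = 0.0
    for n in range(2, limit + 1):
        if n in prime_set:
            a += math.log(n) / n
        # on [n, n+1) the sum is constant and ln(x) increases from ln(n) to ln(n+1)
        worst = max(worst, abs(a - math.log(n)), abs(a - math.log(n + 1)))
    return worst


if __name__ == "__main__":
    print(max_abs_remainder(10 ** 5))
```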

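Then, if we put

\[ A(t):=\sum_{p\le t}\frac{\ln p}{p}, \]

which means (by (2.3.3)) that

\[ A(t)=\ln t+R(t),\qquad |R(t)|<2, \]

by (2.3.2), we conclude that

\begin{align*}
\sum_{p\le x}\frac{1}{p} &= \frac{\ln x+R(x)}{\ln x}+\int_{2}^{x}\frac{\ln t+R(t)}{t(\ln t)^{2}}\,dt\\
&= 1+\frac{R(x)}{\ln x}+\int_{2}^{x}\frac{dt}{t\ln t}+\int_{2}^{x}\frac{R(t)}{t(\ln t)^{2}}\,dt\\
&= 1+\frac{R(x)}{\ln x}+\ln\ln x-\ln\ln 2+\int_{2}^{\infty}\frac{R(t)}{t(\ln t)^{2}}\,dt-\int_{x}^{\infty}\frac{R(t)}{t(\ln t)^{2}}\,dt\\
&= \ln\ln x+\underbrace{1-\ln\ln 2+\int_{2}^{\infty}\frac{R(t)}{t(\ln t)^{2}}\,dt}_{\text{a constant } B}+\underbrace{\frac{R(x)}{\ln x}-\int_{x}^{\infty}\frac{R(t)}{t(\ln t)^{2}}\,dt}_{<\,\frac{2}{\ln x}+\frac{2}{\ln x}\,=\,\frac{4}{\ln x}}\\
&= \ln\ln x+B+\delta,
\end{align*}

where

\[ |\delta|<\frac{4}{\ln x}. \]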

We have proved:

Theorem 6. There exists a constant, $B$, such that for all real numbers $x \ge 2$,

\[ \sum_{p\le x}\frac{1}{p}=\ln\ln x+B+\delta, \tag{2.3.4} \]

where

\[ |\delta|<\frac{4}{\ln x}. \tag{2.3.5} \]

□
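
The explicit bound (2.3.5) is easy to test numerically. The sketch below is our own illustration (the names `primes_up_to` and `worst_ratio` are hypothetical); it takes for $B$ the ten-digit value $0.2614972128$ quoted in Section 1.3 and reports the largest value of $|\delta|\,\ln x/4$ over the integers $2 \le x \le$ `limit`, which Theorem 6 says must stay below 1.

```python
import math

B = 0.2614972128  # Mertens' constant, ten-digit value quoted in the text


def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes p <= n."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]


def worst_ratio(limit):
    """max over integers 2 <= x <= limit of |delta(x)| * ln(x) / 4,
    where delta(x) = sum_{p<=x} 1/p - ln ln x - B."""
    prime_set = set(primes_up_to(limit))
    s = 0.0
    worst = 0.0
    for x in range(2, limit + 1):
        if x in prime_set:
            s += 1.0 / x
        delta = s - math.log(math.log(x)) - B
        worst = max(worst, abs(delta) * math.log(x) / 4.0)
    return worst


if __name__ == "__main__":
    print(worst_ratio(10 ** 5))  # Theorem 6 predicts a value below 1
```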



This is an explicit form of Mertens' theorem (our Theorem 2) with a somewhat better error term than (1.3.2) in Mertens' original statement! Unfortunately, the form of the constant

\[ B:=1-\ln\ln 2+\int_{2}^{\infty}\frac{R(t)}{t(\ln t)^{2}}\,dt \]

gives no clue as to how to compute it, much less that it has the form $\gamma + C$, for some constant $C$, as we saw in equation (1.3.3). This shows both the advantage and the disadvantage of the modern approach: it is systematic and gives a (slightly) better error term with little effort, but it gives no algorithm for the explicit computation of the constant $B$.

There are modern treatments [E], [B], jH] that show the formula

\[ B=\gamma+C \]

to be valid, but there is no modern textbook treatment of the formula (1.3.3). There is a beautiful recent paper [T3] on this formula and its computation which should be consulted.
3 Mertens' Proof



3.1 A Sketch of the Proof

Mertens starts with the convergent "prime zeta function"

\[ \sum_{p}\frac{1}{p^{1+\rho}} \]

where $\rho > 0$, and writes its partial sum for primes $p \le x$ as:

\[ \sum_{p\le x}\frac{1}{p^{1+\rho}}=\sum_{p}\frac{1}{p^{1+\rho}}-\sum_{p>x}\frac{1}{p^{1+\rho}} \tag{3.1.1} \]

and then studies the RHS as $\rho \to 0$. It is fairly easy to show that

\[ \sum_{p}\frac{1}{p^{1+\rho}}=\ln\left(\frac{1}{\rho}\right)-H+o(\rho), \tag{3.1.2} \]

where

\[ H:=-\sum_{n=2}^{\infty}\mu(n)\frac{\ln\{\zeta(n)\}}{n}. \tag{3.1.3} \]

It takes work (!) to show that the "remainder" satisfies

\[ \sum_{p>x}\frac{1}{p^{1+\rho}}=\ln\left(\frac{1}{\rho}\right)-\ln\ln x-\gamma+\delta+o(\rho). \tag{3.1.4} \]

Equations (3.1.1), (3.1.2), and (3.1.4) show

\[ \sum_{p\le x}\frac{1}{p^{1+\rho}}=\ln\ln x+\gamma-H+\delta+o(\rho), \tag{3.1.5} \]

and letting $\rho \to 0$ gives Mertens' theorem.

The equations (3.1.2) and (3.1.4) show that the "Mertens constant," $B$, is the sum of two constants, $\gamma$ and $-H$, and each comes from a different part of the "prime zeta function." It is this fact that makes Mertens' theorem hard to prove.

Our presentation follows Mertens quite closely, although we fill in several details. His mathematics is striking and beautiful, a tour de force of classical analysis.

3.2 Euler-Maclaurin and Stirling

In this section we will cite the versions of the Euler-Maclaurin formula and Stirling's formula which will be used in Mertens' proof. The proof of both can be found in [THj .

Theorem 7. (Euler-Maclaurin) Let $f(t)$ have a continuous derivative, $f'(t)$, for $t \ge 1$. Then:

\[ \sum_{1\le n\le x}f(n)=\int_{1}^{x}f(t)\,dt+\int_{1}^{x}(t-[t])f'(t)\,dt+f(1)-(x-[x])f(x). \tag{3.2.1} \]

□

Theorem 8. (Stirling's Formula) The following relations are valid for all real $x \ge 4$ and all integers $n \ge 5$:

\[ \ln(1\cdot 2\cdot 3\cdots[x])<x\ln x+\frac{1}{2}\ln x-x+\ln\sqrt{2\pi}+\frac{1}{12x} \tag{3.2.2} \]

\[ 2\ln\left(1\cdot 2\cdot 3\cdots\left[\frac{x}{2}\right]\right)>x\ln x-x\ln 2-\ln x-x+2\ln\sqrt{2\pi}+\ln 2-\frac{2}{x-2} \tag{3.2.3} \]

\[ \ln(n!)=n\ln n-n+\frac{1}{2}\ln n+\ln\sqrt{2\pi}+\frac{\lambda}{12n},\qquad |\lambda|<1. \tag{3.2.4} \]

□
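
Formula (3.2.4) can be probed numerically by solving it for $\lambda$. The sketch below is an illustration only: `stirling_lambda` is our own helper name, and it relies on the standard-library function `math.lgamma` to evaluate $\ln(n!)$. The range of $n$ is kept small on purpose, because for large $n$ the value of $\lambda$ approaches 1 so closely that floating-point rounding would blur the comparison.

```python
import math


def stirling_lambda(n):
    """Solve (3.2.4) for lambda: lambda = 12 n (ln n! - main Stirling terms)."""
    main = n * math.log(n) - n + 0.5 * math.log(n) + 0.5 * math.log(2.0 * math.pi)
    return 12.0 * n * (math.lgamma(n + 1) - main)


if __name__ == "__main__":
    for n in (5, 10, 50, 100):
        print(n, stirling_lambda(n))  # each value should lie strictly between 0 and 1
```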



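3.3 The First Step of Mertens' Proof

MERTENS begins with EULER's marvelous identity:

\[ \prod_{p}\frac{1}{1-\dfrac{1}{p^{1+\rho}}}=\sum_{n=1}^{\infty}\frac{1}{n^{1+\rho}} \tag{3.3.1} \]

as indeed does most of analytic prime number theory. Here $\rho > 0$ and the product on the left is taken over all primes $p \ge 2$. The right hand side is the famous Riemann zeta function $\zeta(1+\rho)$.

Now,

\begin{align*}
\sum_{n=1}^{\infty}\frac{1}{n^{1+\rho}} &\overset{(3.2.1)}{=} \int_{1}^{\infty}\frac{dx}{x^{1+\rho}}+1+\theta\cdot(-1),\qquad \theta\in[0,1]\\
&= \left[-\frac{1}{\rho x^{\rho}}\right]_{x=1}^{x=\infty}+1-\theta\\
&= \frac{1}{\rho}+1-\theta
\end{align*}

thus

\[ \prod_{p\ge 2}\frac{1}{1-\dfrac{1}{p^{1+\rho}}}=\frac{1}{\rho}\{1+o(\rho)\}. \tag{3.3.2} \]

Taking logarithms of both sides we obtain:

\[ \ln\left(\prod_{p}\frac{1}{1-\dfrac{1}{p^{1+\rho}}}\right)=\sum_{p}\frac{1}{p^{1+\rho}}+\frac{1}{2}\sum_{p}\frac{1}{p^{2+2\rho}}+\frac{1}{3}\sum_{p}\frac{1}{p^{3+3\rho}}+\cdots \]

Therefore,

\[ \sum_{p}\frac{1}{p^{1+\rho}}=\ln\left\{\frac{1+o(\rho)}{\rho}\right\}-\left\{\frac{1}{2}\sum_{p}\frac{1}{p^{2+2\rho}}+\frac{1}{3}\sum_{p}\frac{1}{p^{3+3\rho}}+\cdots\right\}. \tag{3.3.3} \]

Mertens wants to let $\rho \to 0$ on both sides of (3.3.3). That way, formally, the left hand side becomes

\[ \sum_{p}\frac{1}{p}, \]

the sum he wishes to study, while the right hand side becomes

\[ \lim_{\rho\to 0}\ln\left\{\frac{1+o(\rho)}{\rho}\right\}-\frac{1}{2}\sum_{p}\frac{1}{p^{2}}-\frac{1}{3}\sum_{p}\frac{1}{p^{3}}-\cdots \]

So Mertens defines

\[ H:=\frac{1}{2}\sum_{p}\frac{1}{p^{2}}+\frac{1}{3}\sum_{p}\frac{1}{p^{3}}+\cdots \tag{3.3.4} \]

Combining this result with (3.3.3) we obtain

\[ \sum_{p}\frac{1}{p^{1+\rho}}=\ln\left(\frac{1}{\rho}\right)-H+o(\rho), \tag{3.3.5} \]

which is the equation (3.1.2) cited earlier.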


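3.4 Mertens' Use of Partial Summation

Mertens wants to compute the remainder:

\[ \sum_{p>x}\frac{1}{p^{1+\rho}}. \]

His object is to show that the "remainder" series is, effectively, the series

\[ \sum_{n=G+1}^{\infty}\frac{1}{n^{1+\rho}\ln n}, \]

where $G := [x]$. That way he reduces his problem to the study of an infinite series over all the integers, something hopefully more amenable to analysis. He does this by using partial summation. The form of the partial summation formula which he uses is

\[ \sum_{n=G+1}^{\infty}a(n)f(n)=\sum_{n=G+1}^{\infty}[A(n)-A(n-1)]f(n) \tag{3.4.1} \]

where he puts:

\[ a(n):=\begin{cases}\dfrac{\ln p}{p} & \text{if } n=p\\[2mm] 0 & \text{if } n\neq p\end{cases} \]

and

\[ f(n):=\frac{1}{n^{\rho}\ln n}. \]

Then, if, with Mertens, we put $G := [x]$, we perform an almost dizzying sequence of series transformations to obtain:

\begin{align*}
\sum_{p>G}\frac{1}{p^{1+\rho}}
&\overset{(3.4.1)}{=} \sum_{n=G+1}^{\infty}\frac{A(n)-A(n-1)}{n^{\rho}\ln n}\\
&= -\frac{A(G)}{(G+1)^{\rho}\ln(G+1)}+\sum_{n=G+1}^{\infty}A(n)\left\{\frac{1}{n^{\rho}\ln n}-\frac{1}{(n+1)^{\rho}\ln(n+1)}\right\}\\
&\overset{\text{Grossehilfsatz 1}}{=} -\frac{A(G)}{(G+1)^{\rho}\ln(G+1)}+\sum_{n=G+1}^{\infty}\{\ln n+R(n)\}\left\{\frac{1}{n^{\rho}\ln n}-\frac{1}{(n+1)^{\rho}\ln(n+1)}\right\}\\
&= -\frac{A(G)}{(G+1)^{\rho}\ln(G+1)}+\sum_{n=G+1}^{\infty}R(n)\left\{\frac{1}{n^{\rho}\ln n}-\frac{1}{(n+1)^{\rho}\ln(n+1)}\right\}+\sum_{n=G+1}^{\infty}\left\{\frac{1}{n^{\rho}}-\frac{\ln n}{(n+1)^{\rho}\ln(n+1)}\right\}\\
&= -\frac{A(G)}{(G+1)^{\rho}\ln(G+1)}+\sum_{n=G+1}^{\infty}R(n)\left\{\frac{1}{n^{\rho}\ln n}-\frac{1}{(n+1)^{\rho}\ln(n+1)}\right\}\\
&\qquad+\sum_{n=G+1}^{\infty}\left\{\frac{1}{n^{\rho}}-\frac{1}{(n+1)^{\rho}}+\frac{1}{(n+1)^{1+\rho}\ln(n+1)}+\frac{\lambda}{2n(n+1)^{1+\rho}\ln(n+1)}\right\}\qquad(|\lambda|<1)\\
&= \sum_{n=G+1}^{\infty}\frac{1}{n^{1+\rho}\ln n}+\frac{\ln(G+1)-A(G)}{(G+1)^{\rho}\ln(G+1)}-\frac{1}{(G+1)^{1+\rho}\ln(G+1)}\\
&\qquad+\sum_{n=G+1}^{\infty}\frac{\lambda}{2n(n+1)^{1+\rho}\ln(n+1)}+\sum_{n=G+1}^{\infty}R(n)\left\{\frac{1}{n^{\rho}\ln n}-\frac{1}{(n+1)^{\rho}\ln(n+1)}\right\}
\end{align*}

Here the fifth line uses $\ln n=\ln(n+1)+\ln\left(1-\frac{1}{n+1}\right)=\ln(n+1)-\frac{1}{n+1}-\frac{\lambda}{2n(n+1)}$, and the last line uses the telescoping sum $\sum_{n=G+1}^{\infty}\left\{\frac{1}{n^{\rho}}-\frac{1}{(n+1)^{\rho}}\right\}=\frac{1}{(G+1)^{\rho}}$. We have proved:

Theorem 9.

\[ \sum_{p\ge G+1}\frac{1}{p^{1+\rho}}=\sum_{n=G+1}^{\infty}\frac{1}{n^{1+\rho}\ln n}+\mathfrak{R} \tag{3.4.2} \]

where

\begin{align*}
\mathfrak{R}:={}&\frac{\ln(G+1)-A(G)}{(G+1)^{\rho}\ln(G+1)}-\frac{1}{(G+1)^{1+\rho}\ln(G+1)}\\
&+\lambda\cdot\sum_{n=G+1}^{\infty}\frac{1}{2n(n+1)^{1+\rho}\ln(n+1)}+\sum_{n=G+1}^{\infty}R(n)\left\{\frac{1}{n^{\rho}\ln n}-\frac{1}{(n+1)^{\rho}\ln(n+1)}\right\}.
\end{align*}

□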
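Concerning this rather formidable error term, $\mathfrak{R}$, Mertens writes "Für $\mathfrak{R}$ es leicht eine obere Grenze anzugeben. . ." ("It is easy to obtain an upper bound for $\mathfrak{R}$. . .") He goes on to say that the reason is that by the Grossehilfsatz 1, the numerical value of $R(n)$ can never exceed 2. Indeed, as $\rho \to 0^{+}$:

\begin{align*}
\left|\frac{\ln(G+1)-A(G)}{(G+1)^{\rho}\ln(G+1)}-\frac{1}{(G+1)^{1+\rho}\ln(G+1)}\right|
&= \left|-\frac{R(G)}{(G+1)^{\rho}\ln(G+1)}+\frac{\overbrace{\ln\left(1+\frac{1}{G}\right)-\frac{1}{G+1}}^{<\,\frac{1}{G^{2}}}}{(G+1)^{\rho}\ln(G+1)}\right|\\
&< \frac{2}{\ln(G+1)}+\frac{1}{G^{2}\ln(G+1)},
\end{align*}

and

\begin{align*}
\left|\lambda\sum_{n=G+1}^{\infty}\frac{1}{2n(n+1)^{1+\rho}\ln(n+1)}\right|
&< \frac{1}{2}\sum_{n=G+1}^{\infty}\left\{\frac{1}{n\ln n}-\frac{1}{(n+1)\ln(n+1)}\right\}\\
&= \frac{1}{2(G+1)\ln(G+1)},
\end{align*}

and

\begin{align*}
\left|\sum_{n=G+1}^{\infty}R(n)\left\{\frac{1}{n^{\rho}\ln n}-\frac{1}{(n+1)^{\rho}\ln(n+1)}\right\}\right|
&< 2\sum_{n=G+1}^{\infty}\left\{\frac{1}{\ln n}-\frac{1}{\ln(n+1)}\right\}\\
&= \frac{2}{\ln(G+1)},
\end{align*}

where we used telescopic summation in the last two estimates. Finally, if $G > 2$, then

\begin{align*}
\frac{1}{\ln(G+1)}\left(\frac{1}{G^{2}}+\frac{1}{2(G+1)}\right)
&< \frac{1}{\ln(G+1)}\left(\frac{1}{G^{2}}+\frac{1}{2G}\right)\\
&< \frac{1}{\ln(G+1)}\left(\frac{1}{2G}+\frac{1}{2G}\right)\\
&= \frac{1}{G\ln(G+1)}.
\end{align*}

Therefore, we have proved the following error estimate:

Theorem 10.

\[ |\mathfrak{R}|<\frac{4}{\ln(G+1)}+\frac{1}{G\ln(G+1)}. \tag{3.4.3} \]

□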
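3.5 Proof of the Grossehilfsatz 1

We have used the Grossehilfsatz 1 on several occasions and the time has come to prove it. Starting with the standard definition:

\[ \theta(x):=\sum_{p\le x}\ln p \tag{3.5.1} \]

we will use Chebyshev's technique to prove:

Theorem 11.

\[ \theta(x)<2x. \tag{3.5.2} \]

Proof. The proof is based on the equation

\begin{align*}
\ln(1\cdot 2\cdot 3\cdots[x]) ={}& \theta(x)+\theta(\sqrt{x})+\theta(\sqrt[3]{x})+\cdots\\
&+\theta\left(\frac{x}{2}\right)+\theta\left(\sqrt{\frac{x}{2}}\right)+\theta\left(\sqrt[3]{\frac{x}{2}}\right)+\cdots\\
&+\theta\left(\frac{x}{3}\right)+\theta\left(\sqrt{\frac{x}{3}}\right)+\theta\left(\sqrt[3]{\frac{x}{3}}\right)+\cdots\\
&+\cdots \tag{3.5.3}
\end{align*}

To see why this latter equation is true, define:

\[ \chi(x):=\theta(x)+\theta(\sqrt{x})+\theta(\sqrt[3]{x})+\cdots \tag{3.5.4} \]

Then we use a well-known theorem of Legendre jH] : the prime number $p$ divides the number $n!$ exactly

\[ \left[\frac{n}{p}\right]+\left[\frac{n}{p^{2}}\right]+\left[\frac{n}{p^{3}}\right]+\cdots \]

times. Therefore,

\[ \ln([x]!)=\sum_{p\le x}\left(\left[\frac{x}{p}\right]+\left[\frac{x}{p^{2}}\right]+\cdots\right)\ln p. \]

Here, the second member represents the sum of the values of the function $\ln p$ taken over the lattice points $(p, s, u)$, where $p$ is prime, in the region $p > 0$, $s > 0$, $0 < u \le \frac{x}{p^{s}}$. The part of the sum which corresponds to two given values of $s$ and $u$ is equal to $\theta\left(\sqrt[s]{\frac{x}{u}}\right)$; the part that corresponds to a given value of $u$ is equal to $\chi\left(\frac{x}{u}\right)$.

Therefore,

\[ \ln(1\cdot 2\cdot 3\cdots[x])-2\ln\left(1\cdot 2\cdot 3\cdots\left[\frac{x}{2}\right]\right)=\chi(x)-\chi\left(\frac{x}{2}\right)+\chi\left(\frac{x}{3}\right)-\chi\left(\frac{x}{4}\right)+\cdots \]

But,

\[ \chi(x)\ge\chi\left(\frac{x}{2}\right)\ge\chi\left(\frac{x}{3}\right)\ge\chi\left(\frac{x}{4}\right)\ge\cdots \]

and therefore

\[ \chi(x)-\chi\left(\frac{x}{2}\right)\le\ln(1\cdot 2\cdot 3\cdots[x])-2\ln\left(1\cdot 2\cdot 3\cdots\left[\frac{x}{2}\right]\right). \]

Applying Stirling's formula (3.2.2) and (3.2.3) we obtain that for all $x \ge 4$:

\begin{align*}
\chi(x)-\chi\left(\frac{x}{2}\right) &< x\ln 2+\frac{3}{2}\ln x-\ln\sqrt{2\pi}-\ln 2+\frac{2}{x-2}+\frac{1}{12x}\\
&= x-\left\{(1-\ln 2)x-\frac{3}{2}\ln x+\ln\sqrt{2\pi}+\ln 2-\frac{2}{x-2}-\frac{1}{12x}\right\}\\
&< x.
\end{align*}

But this same inequality can be verified directly for $x < 4$. Therefore, we have proved the general inequality: if $x > 1$, then

\[ \chi(x)-\chi\left(\frac{x}{2}\right)<x. \tag{3.5.5} \]

We now substitute $x, \frac{x}{2}, \frac{x}{4}, \frac{x}{8}, \cdots$ for $x$ until we reach a term $\frac{x}{2^{m}}$ which is less than 2. We then add up the inequalities

\begin{align*}
\chi(x)-\chi\left(\frac{x}{2}\right) &< x\\
\chi\left(\frac{x}{2}\right)-\chi\left(\frac{x}{4}\right) &< \frac{x}{2}\\
\chi\left(\frac{x}{4}\right)-\chi\left(\frac{x}{8}\right) &< \frac{x}{4}\\
&\;\;\vdots\\
\chi\left(\frac{x}{2^{m}}\right)-\chi\left(\frac{x}{2^{m+1}}\right) &< \frac{x}{2^{m}}
\end{align*}

and we obtain

\[ \chi(x)<x\left(1+\frac{1}{2}+\frac{1}{4}+\cdots+\frac{1}{2^{m}}\right)<2x, \]

and so all the more is

\[ \theta(x)<2x. \]

□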

Chebyshev, himself, proved [I] that

\[ 0.904x<\theta(x)<1.113x \]

for $x \ge 38750$.
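
Both ingredients of the preceding proof lend themselves to a quick machine check. The sketch below is an illustration under our own naming (`legendre_exponent`, `theta_below_2x`, `primes_up_to` are hypothetical helpers): it verifies Legendre's formula for the exponent of $p$ in $n!$ by rebuilding $n!$ exactly from its prime factorization, and it tests the inequality $\theta(x) < 2x$ of Theorem 11 over a finite range of integers.

```python
import math


def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes p <= n."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]


def legendre_exponent(n, p):
    """Exponent of the prime p in n!: [n/p] + [n/p^2] + [n/p^3] + ..."""
    exponent, q = 0, p
    while q <= n:
        exponent += n // q
        q *= p
    return exponent


def factorial_from_primes(n):
    """Rebuild n! as the product of p ** legendre_exponent(n, p) over primes p <= n."""
    return math.prod(p ** legendre_exponent(n, p) for p in primes_up_to(n))


def theta_below_2x(limit):
    """True if theta(x) = sum_{p<=x} ln p stays below 2x for all integers 2 <= x <= limit."""
    prime_set = set(primes_up_to(limit))
    theta = 0.0
    for x in range(2, limit + 1):
        if x in prime_set:
            theta += math.log(x)
        if theta >= 2 * x:
            return False
    return True


if __name__ == "__main__":
    print(factorial_from_primes(30) == math.factorial(30))
    print(theta_below_2x(10 ** 5))
```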

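Now we are ready to complete the proof of the Grossehilfsatz 1. We use the inequality for $\theta(x)$ and Legendre's theorem again. This latter implies that

\[ \ln n!=\sum_{p\le n}\left[\frac{n}{p}\right]\ln p+\sum_{p^{2}\le n}\left[\frac{n}{p^{2}}\right]\ln p+\sum_{p^{3}\le n}\left[\frac{n}{p^{3}}\right]\ln p+\cdots \]

If we write

\[ \left[\frac{n}{p}\right]=\frac{n}{p}-r_{p} \]

and use Stirling's formula (3.2.4), we obtain

\[ \ln n-1+\frac{1}{2n}\ln n+\frac{\ln\sqrt{2\pi}}{n}+\frac{\lambda}{12n^{2}}=\sum_{p\le n}\frac{\ln p}{p}-\frac{1}{n}\sum_{p\le n}r_{p}\ln p+\frac{1}{n}\sum_{p^{2}\le n}\left[\frac{n}{p^{2}}\right]\ln p+\cdots \tag{3.5.6} \]

Here, $|\lambda| < 1$. We rewrite this as:

\[ \ln n-\sum_{p\le n}\frac{\ln p}{p}=1-\frac{1}{2n}\ln n-\frac{\ln\sqrt{2\pi}}{n}-\frac{\lambda}{12n^{2}}-\frac{1}{n}\sum_{p\le n}r_{p}\ln p+\frac{1}{n}\sum_{p^{2}\le n}\left[\frac{n}{p^{2}}\right]\ln p+\cdots \tag{3.5.7} \]

Since $0 \le r_{p} < 1$ and $\frac{1}{n}\left[\frac{n}{p^{k}}\right] \le \frac{1}{p^{k}}$, this shows that

\[ \ln n-\sum_{p\le n}\frac{\ln p}{p}-\left\{1-\frac{\ln\sqrt{2\pi n}}{n}-\frac{\lambda}{12n^{2}}\right\} \]

is contained between the upper bound

\[ \sum_{p^{2}\le n}\frac{\ln p}{p^{2}}+\sum_{p^{3}\le n}\frac{\ln p}{p^{3}}+\cdots \]

and the lower bound

\[ -\frac{1}{n}\sum_{p\le n}\ln p. \]

Now, on the one hand, by Theorem 11,

\[ \sum_{p\le n}\ln p<2n, \]

while, on the other hand,

\begin{align*}
\sum_{p^{2}\le n}\frac{\ln p}{p^{2}}+\sum_{p^{3}\le n}\frac{\ln p}{p^{3}}+\cdots
&< \sum_{p\ge 2}\frac{\ln p}{p^{2}}+\sum_{p\ge 2}\frac{\ln p}{p^{3}}+\sum_{p\ge 2}\frac{\ln p}{p^{4}}+\sum_{p\ge 2}\frac{\ln p}{p^{5}}+\cdots\\
&\le \sum_{p\ge 2}\frac{\ln p}{p^{2}}+\frac{1}{2}\sum_{p\ge 2}\frac{\ln p}{p^{2}}+\sum_{p\ge 2}\frac{\ln p}{p^{4}}+\frac{1}{2}\sum_{p\ge 2}\frac{\ln p}{p^{4}}+\cdots\\
&= \frac{3}{2}\left(\sum_{p\ge 2}\frac{\ln p}{p^{2}}+\sum_{p\ge 2}\frac{\ln p}{p^{4}}+\cdots\right)\\
&= \frac{3}{2}\sum_{p\ge 2}\frac{\ln p}{p^{2}}\left(1+\frac{1}{p^{2}}+\frac{1}{p^{4}}+\cdots\right)\\
&= \frac{3}{2}\sum_{p\ge 2}\frac{\ln p}{p^{2}-1}\\
&= \frac{3}{2}\cdot\frac{\displaystyle\sum_{n=1}^{\infty}\frac{\ln n}{n^{2}}}{\displaystyle\sum_{n=1}^{\infty}\frac{1}{n^{2}}}\\
&= \frac{3}{2}\cdot\frac{0.9375482543\ldots}{\pi^{2}/6}\\
&< 1.
\end{align*}

The penultimate equality is the logarithmic derivative of Euler's identity at $\rho = 1$. Therefore, we have proven that for $n > 4$,

\[ \left|\ln n-\sum_{p\le n}\frac{\ln p}{p}\right|<2. \]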

Finally, for 1 ^ n ^ 4, (see pQ) 



1 lnv^ 7rn A 

1 2 I2^ >0 



because 



and 



and therefore, 



In v / 2~7m In 2n In 7r In 2 In 2 

— = + < + = In 2 

n 2n 2n 4 4 

A 1 

< 



12n 2 48 



In \J2im A , 1 

— + j < ln2 + — < 1. 

2 12n 2 48 



This completes the proof of the Grossehilfsatz 1. 
The reader will observe that the more accurate inequality of Chebyshev, $\theta(x) < 1.13x$, is of no use in improving the bound which Mertens obtained in the Grossehilfsatz 1, since it is used to obtain the lower bound, only, while the upper bound is of the form $1 + (1 - \epsilon)$ where $\epsilon$ is very tiny, and for which the results of Chebyshev are irrelevant. Using the most advanced techniques available, Dusart [S] has proven:

\[ \lim_{x\to\infty}\left(\sum_{p\le x}\frac{\ln p}{p}-\ln x\right)=-1.3325822757\ldots \]

So the value 2 given by Mertens as an upper bound for the absolute value of the constant is pretty close to the true value.



3.6 The Grossehilfsatz 2

We state:

Grossehilfsatz 2.

\[ \sum_{n=G+1}^{\infty}\frac{1}{n^{1+\rho}\ln n}=\ln\left(\frac{1}{\rho}\right)-\ln\ln G-\gamma+\frac{\lambda}{G\ln G}+o(\rho), \tag{3.6.1} \]

where $\gamma$ is Euler's constant, and $|\lambda| < 1$.

We offer two proofs: Mertens' original proof, which displays his technical virtuosity, and our own modern proof.



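3.6.1 Mertens' proof

Proof. This is another marvelous tour de force.

The first step is to obtain an estimate for the "remainder" in the Riemann zeta-function: $\sum_{n=G+1}^{\infty}\frac{1}{n^{1+\rho}}$.

We begin by noting that the binomial theorem gives us

\begin{align*}
\frac{1}{n^{\rho}} &= \frac{1}{(n+1)^{\rho}}\left(\frac{n+1}{n}\right)^{\rho}\\
&= \frac{1}{(n+1)^{\rho}}\left(1-\frac{1}{n+1}\right)^{-\rho}\\
&= \frac{1}{(n+1)^{\rho}}\left\{1+\frac{\rho}{1!\,(n+1)}+\frac{\rho(\rho+1)}{2!\,(n+1)^{2}}+\frac{\rho(\rho+1)(\rho+2)}{3!\,(n+1)^{3}}+\cdots\right\}\\
&= \frac{1}{(n+1)^{\rho}}+\frac{\rho}{(n+1)^{1+\rho}}+\frac{\rho(\rho+1)}{2!}\frac{1}{(n+1)^{2+\rho}}+\frac{\rho(\rho+1)(\rho+2)}{3!}\frac{1}{(n+1)^{3+\rho}}+\cdots
\end{align*}

and transposing the first term on the right to the left hand side and dividing both sides by $\rho$ we obtain:

\[ \frac{1}{\rho n^{\rho}}-\frac{1}{\rho(n+1)^{\rho}}=\frac{1}{1!}\frac{1}{(n+1)^{1+\rho}}+\frac{(\rho+1)}{2!}\frac{1}{(n+1)^{2+\rho}}+\frac{(\rho+1)(\rho+2)}{3!}\frac{1}{(n+1)^{3+\rho}}+\cdots \]

If we sum this last equation from $n = G$ to $n = \infty$ we obtain:

\[ \frac{1}{\rho G^{\rho}}=\sum_{n=G+1}^{\infty}\frac{1}{n^{1+\rho}}+\mathfrak{R}' \tag{3.6.2} \]

where

\[ \mathfrak{R}':=\frac{(\rho+1)}{2!}\sum_{n=G+1}^{\infty}\frac{1}{n^{2+\rho}}+\frac{(\rho+1)(\rho+2)}{3!}\sum_{n=G+1}^{\infty}\frac{1}{n^{3+\rho}}+\cdots \tag{3.6.3} \]

We have now obtained the promised representation of the "remainder." The next step is as marvelous as it is unexpected. We integrate (3.6.2) with respect to the exponent, $\rho$!

The summand, $\frac{1}{n^{1+\rho}\ln n}$, can be obtained from the identity:

\[ \int_{\rho}^{1}\frac{dt}{n^{1+t}}=\frac{1}{n^{1+\rho}\ln n}-\frac{1}{n^{2}\ln n}. \]

If we apply this to (3.6.2) and (3.6.3) by integrating them from $t = \rho$ to $t = 1$ we obtain

\begin{align*}
\sum_{n=G+1}^{\infty}\frac{1}{n^{1+\rho}\ln n}-\sum_{n=G+1}^{\infty}\frac{1}{n^{2}\ln n}
&= \int_{\rho}^{1}\frac{dt}{tG^{t}}-\int_{\rho}^{1}\mathfrak{R}'\,dt\\
&= \int_{\rho}^{\infty}\frac{dt}{tG^{t}}-\int_{1}^{\infty}\frac{dt}{tG^{t}}-\int_{\rho}^{1}\mathfrak{R}'\,dt\\
&\overset{x:=t\ln G}{=} \int_{\rho\ln G}^{\infty}\frac{dx}{xe^{x}}-\int_{1}^{\infty}\frac{dt}{tG^{t}}-\int_{\rho}^{1}\mathfrak{R}'\,dt\\
&= \int_{\rho\ln G}^{\infty}\frac{dx}{e^{x}-1}-\int_{\rho\ln G}^{\infty}\left\{\frac{1}{e^{x}-1}-\frac{1}{xe^{x}}\right\}dx-\int_{1}^{\infty}\frac{dt}{tG^{t}}-\int_{\rho}^{1}\mathfrak{R}'\,dt\\
&= -\ln\left(1-\frac{1}{G^{\rho}}\right)-\underbrace{\int_{0}^{\infty}\left\{\frac{1}{e^{x}-1}-\frac{1}{xe^{x}}\right\}dx}_{\gamma\ \text{(Euler's constant)}}+\underbrace{\int_{0}^{\rho\ln G}\left\{\frac{1}{e^{x}-1}-\frac{1}{xe^{x}}\right\}dx}_{<\,\rho\ln G\ \text{if}\ \rho\,<\,\ln G}\\
&\qquad-\int_{1}^{\infty}\frac{dt}{tG^{t}}-\int_{\rho}^{1}\mathfrak{R}'\,dt\\
&= \ln\left(\frac{1}{\rho}\right)-\ln\ln G-\gamma-\int_{1}^{\infty}\frac{dt}{tG^{t}}-\int_{\rho}^{1}\mathfrak{R}'\,dt+o(\rho),
\end{align*}

since

\begin{align*}
-\ln\left(1-\frac{1}{G^{\rho}}\right) &= -\ln\left(1-e^{-\rho\ln G}\right)\\
&= -\ln\left(1-\{1-\rho\ln G+o(\rho)\}\right)\\
&= -\ln\rho-\ln\ln G+o(\rho)\\
&= \ln\left(\frac{1}{\rho}\right)-\ln\ln G+o(\rho),
\end{align*}

and therefore,

\[ \sum_{n=G+1}^{\infty}\frac{1}{n^{1+\rho}\ln n}=\ln\left(\frac{1}{\rho}\right)-\ln\ln G-\gamma\underbrace{-\int_{1}^{\infty}\frac{dt}{tG^{t}}+\sum_{n=G+1}^{\infty}\frac{1}{n^{2}\ln n}-\int_{\rho}^{1}\mathfrak{R}'\,dt}_{\epsilon\ =\ \text{error}}+o(\rho). \tag{3.6.4} \]

This shows where the Euler's constant component of Mertens' constant $B$ comes from. Namely, from a subtle and delicate trick of adding and subtracting the nonobvious integral $\int_{\rho\ln G}^{\infty}\frac{dx}{e^{x}-1}$ to and from the sum $\sum_{n=G+1}^{\infty}\frac{1}{n^{1+\rho}\ln n}$.

Now we estimate the error:

\begin{align*}
\int_{\rho}^{1}\mathfrak{R}'\,dt &< \int_{\rho}^{1}\left\{\sum_{n=G+1}^{\infty}\frac{1}{n^{2+t}}+\sum_{n=G+1}^{\infty}\frac{1}{n^{3+t}}+\cdots\right\}dt\\
&< \sum_{n=G+1}^{\infty}\left(\frac{1}{n^{2}\ln n}-\frac{1}{n^{3}\ln n}\right)+\sum_{n=G+1}^{\infty}\left(\frac{1}{n^{3}\ln n}-\frac{1}{n^{4}\ln n}\right)+\cdots\\
&= \sum_{n=G+1}^{\infty}\frac{1}{n^{2}\ln n}\\
&< \sum_{n=G+1}^{\infty}\left\{\frac{1}{(n-1)\ln(n-1)}-\frac{1}{n\ln n}\right\}\\
&= \frac{1}{G\ln G}
\end{align*}

and

\[ \int_{1}^{\infty}\frac{dt}{tG^{t}}<\int_{1}^{\infty}\frac{dt}{G^{t}}=\frac{1}{G\ln G}. \]

Therefore,

\begin{align*}
\epsilon &= -\int_{1}^{\infty}\frac{dt}{tG^{t}}+\sum_{n=G+1}^{\infty}\frac{1}{n^{2}\ln n}-\int_{\rho}^{1}\mathfrak{R}'\,dt\\
&= \lambda_{1}\sum_{n=G+1}^{\infty}\frac{1}{n^{2}\ln n}-\frac{\lambda_{2}}{G\ln G}\\
&= \frac{\lambda_{3}-\lambda_{2}}{G\ln G},
\end{align*}

so that

\[ |\epsilon|<\frac{1}{G\ln G}, \]

where $0 < \lambda_{k} < 1$ for $k = 1, 2, 3$.

We have shown:

\[ \sum_{n=G+1}^{\infty}\frac{1}{n^{1+\rho}\ln n}=\ln\left(\frac{1}{\rho}\right)-\ln\ln G-\gamma+\frac{\lambda}{G\ln G}+o(\rho), \]

where $|\lambda| < 1$. This completes the proof of the Grossehilfsatz 2. □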
3.6.2 Modern Proof 

It may be of interest to insert a modern proof of Grossehilfsatz 2 based on a simple form of the Euler-Maclaurin formula as given by Boas j3j.

Theorem 12. Let $f(t)$ be positive for $t > 0$ and suppose that $|f'(t)|$ is decreasing. If $\sum_{n=1}^{\infty}f(n)$ is convergent and if

\[ R_{n}:=f(n+1)+f(n+2)+\cdots, \]

then there exists a number $\theta$ with $0 < \theta < 1$ such that the following equation is valid:

\[ R_{n}=\int_{n+\frac{1}{2}}^{\infty}f(t)\,dt+\frac{\theta}{8}f'(n+1). \tag{3.6.5} \]

□

In the coming computation, we will use the following results. For fixed $G$,

\[ \left(G+\frac{1}{2}\right)^{-\rho}=(G+1)^{-\rho}=1+o(\rho) \tag{3.6.6} \]

since, for any constant, $a$,

\[ (G+a)^{-\rho}=e^{-\rho\ln(G+a)}=1-\rho\ln(G+a)+\frac{1}{2}\{\rho\ln(G+a)\}^{2}-\cdots=1+o(\rho). \]

Moreover, by Taylor's theorem

\[ \ln(1+x)=x-\frac{\lambda}{2}x^{2}, \tag{3.6.7} \]

where $0 < \lambda < 1$. Finally,

\[ -\gamma=\int_{0}^{\infty}\frac{\ln v}{e^{v}}\,dv \tag{3.6.8} \]

which follows from the change of variable $x := e^{-v}$ in the standard integral

\[ -\gamma=\int_{0}^{1}\ln\ln\frac{1}{x}\,dx, \]

which appears in Havil [Jj, p. 109.

Then, substituting in (3.6.5) and integrating by parts with

\[ u:=x^{-\rho},\qquad dv:=\frac{dx}{x\ln x} \]

we obtain
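\begin{align*}
\sum_{n=G+1}^{\infty}\frac{1}{n^{1+\rho}\ln n}
&= \int_{G+\frac{1}{2}}^{\infty}\frac{dx}{x^{1+\rho}\ln x}+\frac{\theta}{8}\left[\frac{1}{x^{1+\rho}\ln x}\right]'_{x=G+1}\\
&= \left[\frac{\ln\ln x}{x^{\rho}}\right]_{x=G+\frac{1}{2}}^{x=\infty}-\int_{G+\frac{1}{2}}^{\infty}\frac{(\ln\ln x)(-\rho)}{x^{\rho+1}}\,dx-\frac{\theta}{8(G+1)^{2+\rho}}\left\{\frac{1+\rho}{\ln(G+1)}+\frac{1}{\ln^{2}(G+1)}\right\}\\
&= -\frac{\ln\ln\left(G+\frac{1}{2}\right)}{\left(G+\frac{1}{2}\right)^{\rho}}+\rho\int_{G+\frac{1}{2}}^{\infty}\frac{(\ln\ln x)}{x^{\rho+1}}\,dx-\frac{\theta}{8(G+1)^{2+\rho}}\left\{\frac{1+\rho}{\ln(G+1)}+\frac{1}{\ln^{2}(G+1)}\right\}\\
&\overset{(3.6.6)}{=} -\ln\ln\left(G+\frac{1}{2}\right)+\rho\int_{G+\frac{1}{2}}^{\infty}\frac{(\ln\ln x)}{x^{\rho+1}}\,dx-\frac{\theta_{1}}{4(G+1)^{2}}+o(\rho)\qquad(0<\theta_{1}<1)\\
&= -\ln\ln G-\ln\left\{1+\frac{\ln\left(1+\frac{1}{2G}\right)}{\ln G}\right\}+\rho\int_{G+\frac{1}{2}}^{\infty}\frac{(\ln\ln x)}{x^{\rho+1}}\,dx-\frac{\theta_{1}}{4(G+1)^{2}}+o(\rho)\\
&\overset{(3.6.7)}{=} -\ln\ln G-\frac{\theta_{2}}{2G\ln G}+\rho\int_{G+\frac{1}{2}}^{\infty}\frac{(\ln\ln x)}{x^{\rho+1}}\,dx-\frac{\theta_{1}}{4(G+1)^{2}}+o(\rho)\qquad(0<\theta_{2}<1)\\
&= -\ln\ln G+\rho\int_{G+\frac{1}{2}}^{\infty}\frac{(\ln\ln x)}{x^{\rho+1}}\,dx-\frac{\theta_{3}}{G\ln G}+o(\rho)\qquad(0<\theta_{3}<1)\\
&\overset{v:=\rho\ln x}{=} -\ln\ln G+\int_{\rho\ln(G+\frac{1}{2})}^{\infty}\frac{\ln v}{e^{v}}\,dv+\int_{\rho\ln(G+\frac{1}{2})}^{\infty}\frac{\ln\frac{1}{\rho}}{e^{v}}\,dv-\frac{\theta_{3}}{G\ln G}+o(\rho)\\
&= -\ln\ln G+\frac{\ln\frac{1}{\rho}}{\left(G+\frac{1}{2}\right)^{\rho}}+\int_{\rho\ln(G+\frac{1}{2})}^{\infty}\frac{\ln v}{e^{v}}\,dv-\frac{\theta_{3}}{G\ln G}+o(\rho)\\
&\overset{(3.6.6)}{=} \ln\frac{1}{\rho}-\ln\ln G+\int_{\rho\ln(G+\frac{1}{2})}^{\infty}\frac{\ln v}{e^{v}}\,dv-\frac{\theta_{3}}{G\ln G}+o(\rho)\\
&= \ln\frac{1}{\rho}-\ln\ln G+\int_{0}^{\infty}\frac{\ln v}{e^{v}}\,dv-\int_{0}^{\rho\ln(G+\frac{1}{2})}\frac{\ln v}{e^{v}}\,dv-\frac{\theta_{3}}{G\ln G}+o(\rho)\\
&\overset{(3.6.8)}{=} \ln\frac{1}{\rho}-\ln\ln G-\gamma+o(\rho)-\frac{\theta_{3}}{G\ln G}+o(\rho)\\
&= \ln\frac{1}{\rho}-\ln\ln G-\gamma-\frac{\theta_{3}}{G\ln G}+o(\rho).
\end{align*}

□

Observe that this method produces the dominant terms

\[ \ln\frac{1}{\rho},\qquad -\ln\ln G,\qquad \text{Euler's constant}=\gamma, \]

almost automatically, without the nonobvious and tricky (but beautiful and clever) artifices employed by Mertens, while the error term, $-\frac{\theta_{3}}{G\ln G}$ with a sign, appears with virtually no effort. The reason is the power of the half-interval version of the Euler-Maclaurin formula combined with the use of integration by parts. I think that Mertens would have liked this proof.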



3.7 The Formula for the Constant H

Mertens computes the constant $B := \gamma - H$ by finding a rapidly convergent series for $H$. The paper treats the computation exhaustively. However, they do not give Mertens' own derivation, so we develop it here. Define:

\[ x_{k}:=\frac{1}{k}\sum_{p\ge 2}\frac{1}{p^{k}},\qquad \zeta(k):=\sum_{n=1}^{\infty}\frac{1}{n^{k}}. \]

Then, by (3.3.4)

\begin{align*}
H &= x_{2}+x_{3}+x_{4}+x_{5}+x_{6}+x_{7}+x_{8}+\cdots \tag{3.7.1}\\
\tfrac{1}{2}\ln\{\zeta(2)\} &= x_{2}+x_{4}+x_{6}+x_{8}+\cdots \tag{3.7.2}\\
\tfrac{1}{3}\ln\{\zeta(3)\} &= x_{3}+x_{6}+\cdots \tag{3.7.3}\\
\tfrac{1}{4}\ln\{\zeta(4)\} &= x_{4}+x_{8}+\cdots \tag{3.7.4}
\end{align*}

and so on. Now, let $\mu(n)$:

1. have the value 1, if $n = 1$, or if $n$ is the product of an even number of distinct primes.

2. have the value $-1$ if $n$ is the product of an odd number of distinct primes.

3. vanish if $n$ is divisible by the square of a prime.

Moreover, let $1, d, d', \cdots$ be all the divisors of $n$. Then it follows from the definition of the numbers $\mu(1), \mu(2), \mu(3), \cdots$, that for any integer $n$ greater than 1,

\[ \mu(1)+\mu(d)+\mu(d')+\cdots=0. \tag{3.7.5} \]

Now, if we multiply the equations (3.7.1), (3.7.2), (3.7.3), etc. by $\mu(1), \mu(2), \mu(3)$, etc., respectively, and add up the resulting equations and use (3.7.5), we see that $x_{2}, x_{3}, x_{4}, \ldots$ all drop out and we obtain:

\[ H-\frac{1}{2}\ln\{\zeta(2)\}-\frac{1}{3}\ln\{\zeta(3)\}-\frac{1}{5}\ln\{\zeta(5)\}+\cdots=0. \]

Therefore, we have proved:

Theorem 13.

\[ H=\frac{1}{2}\ln\{\zeta(2)\}+\frac{1}{3}\ln\{\zeta(3)\}+\frac{1}{5}\ln\{\zeta(5)\}-\frac{1}{6}\ln\{\zeta(6)\}+\frac{1}{7}\ln\{\zeta(7)\}-\frac{1}{10}\ln\{\zeta(10)\}+\cdots \]

□

We observe that the absolute convergence of the series in question allows the elimination of the $x_{k}$'s. Using the published tables of Legendre ^2] of the values of $\zeta(m)$ to fifteen decimal places, Mertens computed the value:

\[ H\approx 0.31571845205, \]

and therefore,

\[ B=\gamma-H\approx 0.2614972128. \]
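
The series of Theorem 13 converges so quickly that Mertens' value can be reproduced in a few lines. The following sketch is ours and is meant only as an illustration: the function names `mobius`, `zeta` and `mertens_H` are hypothetical, $\zeta(s)$ is approximated by a partial sum plus the first Euler-Maclaurin tail corrections (an assumption of this sketch, not Mertens' method, who used Legendre's tables), and the value of Euler's constant is hard-coded.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded for this sketch


def mobius(n):
    """Moebius function mu(n) by trial division."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0  # n is divisible by the square of a prime
            result = -result
        d += 1
    if n > 1:
        result = -result
    return result


def zeta(s, cutoff=200):
    """zeta(s) for integer s >= 2: partial sum below `cutoff` plus Euler-Maclaurin tail terms."""
    head = sum(k ** -s for k in range(1, cutoff))
    tail = cutoff ** (1 - s) / (s - 1) + 0.5 * cutoff ** -s + s * cutoff ** (-s - 1) / 12.0
    return head + tail


def mertens_H(terms=80):
    """H = -sum_{n=2}^{terms} mu(n) ln(zeta(n)) / n, the series of Theorem 13 truncated."""
    return -sum(mobius(n) * math.log(zeta(n)) / n for n in range(2, terms + 1))


if __name__ == "__main__":
    H = mertens_H()
    print("H =", H)
    print("B =", EULER_GAMMA - H)
```

Since $\ln\zeta(n)$ is of the order of $2^{-n}$, truncating the series at $n = 80$ leaves a tail far below double-precision rounding.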



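3.8 Completion of the Proof

Now we follow the sketch in 3.1.

\begin{align*}
\sum_{p\le x}\frac{1}{p^{1+\rho}} &= \sum_{p}\frac{1}{p^{1+\rho}}-\sum_{p>x}\frac{1}{p^{1+\rho}}\\
&= \ln\left(\frac{1}{\rho}\right)-H+o(\rho)-\sum_{p>x}\frac{1}{p^{1+\rho}} && \text{(by (3.3.5))}\\
&= \ln\left(\frac{1}{\rho}\right)-H+o(\rho)-\sum_{n=G+1}^{\infty}\frac{1}{n^{1+\rho}\ln n}-\mathfrak{R} && \text{(by (3.4.2))}\\
&= \ln\ln G+\gamma-H-\frac{\lambda}{G\ln G}-\mathfrak{R}+o(\rho) && \text{(by (3.6.1))}\\
&= \ln\ln G+\gamma-H+\delta+o(\rho) && \text{(by (3.4.3))}
\end{align*}

where

\[ |\delta|<\frac{4}{\ln(G+1)}+\frac{2}{G\ln G}. \]

Letting $\rho \to 0$ we obtain

\[ \sum_{p\le x}\frac{1}{p}=\ln\ln G+\gamma-H+\delta. \]

This completes Mertens' proof of Mertens' Theorem.

□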



4 Retrospect and Prospect
4.1 Retrospect

Is this proof not stunning? The basic idea, totally different from the modern method, is to work with the convergent "prime zeta function" and study the remainder as $\rho \to 0^{+}$. The modern proof is a direct use of partial summation on the given sum.

Mertens' proof is quite natural in approach, and the constant $H$ appears quite inevitably. The series computations and the manipulation of inequalities are breathtaking. His use of partial summation is brilliant; indeed, it was hailed as a new technique in prime number theory by contemporaries pQ. Finally we signal the repeated clever use of telescopic summations in the estimation of error terms.

Any contemporary analyst can marvel at and be instructed by Mertens' "arabesques of algebra," a telling phrase due to E.T. Bell |2j to describe the manipulations of Jacobi in the theory of elliptic functions to discover number-theoretic theorems, but equally applicable to Mertens' mathematics in this memoir.

All the techniques Mertens used are now standard tools for the analytic number theorist (among others), but it is a joy to see them used together in a single focused effort to obtain his one towering result.

4.2 Prospect 

Modern work on MERTENS' theorem has concentrated on improving the error term. The best result to date which has been completely proven is due to Dusart jj]:

Theorem 14. For $x > 1$

\[ \sum_{p\le x}\frac{1}{p}-\ln\ln x-B\ge-\left(\frac{1}{10\ln^{2}x}+\frac{4}{15\ln^{3}x}\right). \]

For $x \ge 10372$

\[ \left|\sum_{p\le x}\frac{1}{p}-\ln\ln x-B\right|\le\left(\frac{1}{10\ln^{2}x}+\frac{4}{15\ln^{3}x}\right). \]

□

The best result to date, assuming the validity of the Riemann Hypothesis (!), is due to SCHOENFELD [T^j, and affirms:

Theorem 15. If $x \ge 13.5$, then:

\[ \left|\sum_{p\le x}\frac{1}{p}-\ln\ln x-B\right|<\frac{3\ln x+4}{8\pi\sqrt{x}}. \]

□

In both cases, the error term is much better than that of Mertens, himself, but no optimal error term has been found.

Recently, M. Wolf [T7| derived Mertens' series by a completely different method. He uses the "generalized Brun's constants" which measure the gaps between consecutive primes, and by an ingenious combination of hard rigorous computations and heuristic numerical arguments obtains Mertens' series, including the big "O" error term. Moreover, he prepared a numerical table (which I reproduce with his permission) comparing the error term in Theorem 15 with the true error.
The Ratio of the True Error to the Predicted Error

Column 1: $x$
Column 2: $\left|\sum_{p<x}1/p-\log\log x-B\right|$
Column 3: $\dfrac{3\log(x)+4}{8\pi\sqrt{x}}$
Column 4: ratio column 2 / column 3

x                          column 2            column 3            ratio
2^16 = 65536               2.43328226E-0004    5.79284588E-0003    4.20049542E-0002
2^17 = 131072              2.26479291E-0004    4.32469516E-0003    5.23688450E-0002
2^18 = 262144              1.11367788E-0004    3.21961962E-0003    3.45903559E-0002
2^19 = 524288              1.23916030E-0004    2.39088215E-0003    5.18285814E-0002
2^20 = 1048576             5.58449145E-0005    1.77140815E-0003    3.15257184E-0002
2^21 = 2097152             4.63383665E-0005    1.30970835E-0003    3.53806756E-0002
2^22 = 4194304             3.20736392E-0005    9.66503244E-0004    3.31852370E-0002
2^23 = 8388608             1.83353157E-0005    7.11987819E-0004    2.57522885E-0002
2^24 = 16777216            1.10324946E-0005    5.23651207E-0004    2.10684030E-0002
2^25 = 33554432            1.29876787E-0005    3.84560730E-0004    3.37727640E-0002
2^26 = 67108864            6.42047777E-0006    2.82025396E-0004    2.27656015E-0002
2^27 = 134217728           3.69019851E-0006    2.06563775E-0004    1.78646934E-0002
2^28 = 268435456           3.19579180E-0006    1.51112594E-0004    2.11484146E-0002
2^29 = 536870912           1.63321145E-0006    1.10423592E-0004    1.47904212E-0002
2^30 = 1073741824          1.72440466E-0006    8.06062453E-0005    2.13929411E-0002
2^31 = 2147483648          8.53875133E-0007    5.87826489E-0005    1.45259723E-0002
2^32 = 4294967296          5.34863712E-0007    4.28280967E-0005    1.24886173E-0002
2^33 = 8589934592          5.56640268E-0007    3.11767507E-0005    1.78543387E-0002
2^34 = 17179869184         3.62687244E-0007    2.26765354E-0005    1.59939443E-0002
2^35 = 34359738368         1.45653226E-0007    1.64810885E-0005    8.83759748E-0003
2^36 = 68719476736         1.16826187E-0007    1.19695112E-0005    9.76031397E-0003
2^37 = 137438953472        9.94572329E-0008    8.68690083E-0006    1.14491042E-0002
2^38 = 274877906944        6.52601557E-0008    6.30037736E-0006    1.03581344E-0002
2^39 = 549755813888        5.50727125E-0008    4.56662870E-0006    1.20598183E-0002
2^40 = 1099511627776       3.20547296E-0008    3.30799956E-0006    9.69006466E-0003
2^41 = 2199023255552       1.70901151E-0008    2.39490349E-0006    7.13603497E-0003
2^42 = 4398046511104       1.95113765E-0008    1.73290522E-0006    1.12593442E-0002
2^43 = 8796093022208       9.40614690E-0009    1.25324631E-0006    7.50542552E-0003
2^44 = 17592186044416      3.88364187E-0009    9.05905329E-0007    4.28702840E-0003

It's clear that the error term ratio stays fairly constant, so that the order of magnitude is correct, although the numerical constants in the error formula need considerable improvement!
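
The first row of the table is small enough to recompute on any machine. The sketch below is an illustration with our own helper names (`primes_up_to`, `schoenfeld_bound`, `true_error`); it takes $B \approx 0.2614972128$ from Section 1.3 and evaluates both the predicted error of Theorem 15 and the true error at $x = 2^{16}$. It is not Wolf's program, and the larger rows of the table would need a segmented sieve rather than this simple one.

```python
import math

B = 0.2614972128  # Mertens' constant, ten-digit value quoted in the text


def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes p <= n."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]


def schoenfeld_bound(x):
    """Predicted error of Theorem 15: (3 ln x + 4) / (8 pi sqrt(x))."""
    return (3.0 * math.log(x) + 4.0) / (8.0 * math.pi * math.sqrt(x))


def true_error(x):
    """|sum_{p<=x} 1/p - ln ln x - B| for an integer x >= 3."""
    s = math.fsum(1.0 / p for p in primes_up_to(x))
    return abs(s - math.log(math.log(x)) - B)


if __name__ == "__main__":
    x = 2 ** 16
    err, bound = true_error(x), schoenfeld_bound(x)
    print(err, bound, err / bound)
```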